Voice Command II: A DSP Implementation of Robust Speech Recognition in Real-World Noisy Environments

Authors

Soo-Young Lee

Doh-Suk Kim

Ki-Hwan Ahn

Jae-Hoon Jeong

Hoon Kim

Jong-Seok Lee

Hee-Youn Lee

Abstract

The "Voice Command" system, designed for isolated word recognition tasks in real-world noisy environments, was implemented on a fixed-point DSP board to operate in real time. A simple auditory model, the zero-crossings with peak amplitudes (ZCPA) model, is used for noise-robust feature extraction, and a neural network classifier recognizes the input patterns. System performance is further improved by incorporating speaker adaptation and out-of-vocabulary word rejection capabilities. The radial basis function (RBF) classifier provides better rejection performance than multi-layer perceptron (MLP) classifiers.
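The abstract does not spell out how ZCPA features are computed, so the following is only a minimal sketch of the idea as it is commonly described: within one band-limited channel, the interval between successive upward zero crossings gives a frequency estimate, and the peak amplitude inside that interval, after log compression, weights the matching frequency bin. The function name `zcpa_histogram`, the bin edges, and the `log(1 + peak)` compression are assumptions made for illustration. The band-pass filter bank that would normally precede this step is omitted, and nothing here reflects the fixed-point DSP code of the actual system.

```python
import math


def zcpa_histogram(signal, sample_rate, bin_edges):
    """Accumulate a ZCPA-style frequency histogram for one band-limited signal.

    Hypothetical helper: bin_edges is an ascending list of frequencies in Hz,
    and the log(1 + peak) weighting is an assumed compression function.
    """
    hist = [0.0] * (len(bin_edges) - 1)

    # Sample indices where the signal crosses zero going upward.
    crossings = [n for n in range(1, len(signal))
                 if signal[n - 1] < 0.0 <= signal[n]]

    # Each pair of successive upward crossings delimits one interval.
    for start, end in zip(crossings, crossings[1:]):
        freq = sample_rate / (end - start)   # inverse of the interval length
        peak = max(signal[start:end])        # peak amplitude in the interval
        for b in range(len(hist)):
            if bin_edges[b] <= freq < bin_edges[b + 1]:
                hist[b] += math.log(1.0 + peak)
                break
    return hist


# Usage: a 100 Hz tone sampled at 8 kHz, with three illustrative bins.
tone = [math.sin(2 * math.pi * 100 * n / 8000 + 0.3) for n in range(800)]
print(zcpa_histogram(tone, 8000, [0, 50, 150, 400]))
```

For the pure tone in the usage lines, all of the weight lands in the middle bin, the one that covers 100 Hz. A full front end would repeat this per filter-bank channel and per analysis frame, then sum the histograms across channels.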

Related resources

Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

Automatic recognition of speech emotional states in noisy conditions has become an important research topic in the emotional speech recognition area, in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ the power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate its perfor...

An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems for feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...

Improving the performance of MFCC for Persian robust speech recognition

The Mel frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...

Performance Evaluation of CMN for Mel-LPC based Speech Recognition in Different Noisy Environments

This study is intended to develop a noise robust distributed speech recognizer for real-world applications by employing Cepstral Mean Normalization (CMN) for robust feature extraction. The main focus of the work is to cope with different noisy environments. To realize this objective, Mel-LP based speech analysis has been used in speech coding on the linear frequency scale by applying a first-or...

Robust voice activity detection using perceptual wavelet-packet transform and Teager energy operator

In this letter, a robust voice activity detection (VAD) algorithm is presented. This proposed VAD algorithm makes use of the perceptual wavelet-packet transform and the Teager energy operator to compute a robust parameter called voice activity shape for VAD. The main advantage of this algorithm is that the preset threshold values or a priori knowledge of the SNR usually needed in conventional V...

Publication date: 1997
